Monte Carlo Sampling for Regret Minimization in Extensive Games

Authors

  • Marc Lanctot
  • Kevin Waugh
  • Martin Zinkevich
  • Michael H. Bowling
Abstract

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect-information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent, sample-based CFR algorithms called Monte Carlo counterfactual regret minimization (MCCFR), of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR in expectation. Then, we introduce two sampling schemes, outcome sampling and external sampling, and show that both have bounded overall regret with high probability. Thus, they can compute an approximate equilibrium using self-play. Finally, we prove a new, tighter bound on the regret for the original CFR algorithm and relate this new bound to MCCFR’s bounds. We show empirically that, although the sample-based algorithms require more iterations, their lower cost per iteration can lead to dramatically faster convergence in various games.
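
The idea of sampled regret updates can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a one-shot matrix game (rock-paper-scissors) rather than an extensive game, and it samples only the opponent's action, loosely in the spirit of external sampling. The names `PAYOFF`, `regret_matching`, `utility`, and `sampled_self_play` are hypothetical and chosen for illustration only.

```python
import random

# Payoff to player 0 in rock-paper-scissors (rows: player 0, columns: player 1).
# Player 1 receives the negation (zero-sum). Illustrative game, not from the paper.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(regrets):
    """Return a strategy proportional to positive regrets (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]


def utility(player, own, other):
    """Payoff to `player` when it plays `own` and the opponent plays `other`."""
    return PAYOFF[own][other] if player == 0 else -PAYOFF[other][own]


def sampled_self_play(iterations, seed=0):
    """Run self-play with sampled regret updates; return both average strategies."""
    rng = random.Random(seed)
    n = 3
    regrets = [[0.0] * n, [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            opponent = 1 - player
            # Sample one opponent action instead of enumerating all of them.
            # In expectation this equals the full (unsampled) regret update.
            sampled = rng.choices(range(n), weights=strategies[opponent])[0]
            values = [utility(player, a, sampled) for a in range(n)]
            expected = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(n):
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(sampled_self_play(50000)):
        print(player, [round(p, 3) for p in avg])
```

Under these assumptions, the averaged strategies should drift toward the uniform equilibrium of rock-paper-scissors as the number of iterations grows, while each iteration looks at a single sampled opponent action rather than all of them.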

Similar resources

Supplemental Material for Monte Carlo Sampling for Regret Minimization in Extensive Games

This supplementary material first gives a detailed description of the MCCFR algorithm. We then give proofs of Theorems 3, 4, and 5 from the submission Monte Carlo Sampling for Regret Minimization in Extensive Games. We begin with some preliminaries, then prove a general result about all members of the MCCFR family of algorithms (Theorem 18 in Section 6). We then use that result...

Computation and Decision-Making in Large Extensive Form Games

In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MCCFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a...

Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions

Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing strategies in extensive-form games. The Monte Carlo CFR (MCCFR) variants reduce the per iteration time cost of CFR by traversing a smaller, sampled portion of the tree. The previous most effective instances of MCCFR can still be very slow in games with many player actions since they sample every action for ...

Generalized Sampling and Variance in Counterfactual Regret Minimization

In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. While the base algorithm performs a full tree traversal on each iteration, Monte Carlo CFR (MCCFR) reduces the per iteration time cost by traversing just a sampled portion of the tree. On the other hand, MCCFR’s sampled v...

Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games

Online search in games has been a core interest of artificial intelligence. Search in imperfect information games (e.g., Poker, Bridge, Skat) is particularly challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling, an online search variant of Monte Carlo Counterfactual Regret Minimization, which preserves its convergence to Nash eq...

Publication date: 2009